Main modifications with respect to previous versions:
2020-11-30
2020-11-06
2020-10-30
2020-09-24
We simulate a two-arm trial with two binary endpoints according to the following parameters:
The correlation between the endpoints is assumed to be equal for both groups. We note that since the correlation is bounded between two values defined by (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), not all the combinations of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)) are possible. We calculate the correlation bounds for each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)) and discard those scenarios in which the correlation is not within the valid range. The total number of scenarios is \(183\).
We consider the composite binary endpoint formed by endpoints 1 and 2. Based on the parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)), we compute:
We assume that the endpoint 1 is more relevant for the clinical question than endpoint 2.
ARE Method:
We consider the Asymptotic Relative Efficiency (ARE) method to quantify the efficiency of using the composite endpoint over the endpoint 1 as primary endpoint of the trial. Based on the parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)), we compute:
Ratio Sample sizes:
We calculate the sample size to have \(0.80\) power to detect an effect of \(OR_1\) on the endpoint 1, given the probability \(p_1^{(0)}\) under the control group, at significance level \(\alpha = 0.05\), say \(n_1\).
We also calculate the sample size to have \(0.80\) power to detect an effect of \(OR_*\) on the composite endpoint, given the probability \(p_*^{(0)}\) under the control group, at significance level \(\alpha = 0.05\), say \(n_*\). Note that for calculating the sample size for the composite endpoint have been used the values of the parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)).
The decision criteria is the ratio of the two sample sizes:
\[n_1 / n_*\]
so that if the ratio is smaller than \(1\), we use the relevant endpoint; whereas if the ratio is larger than \(1\), we use the composite endpoint.
We evaluate through simulations the statistical power and significance level under different situations and using different primary endpoints.
For each scenario, we compare the two groups based on three endpoints:
Remark: we simulate \(100000\) trials for each scenario and for each endpoint considered as primary endpoint;
For each scenario, we evaluate the following cases:
Under alternative hypothesis (\(H_1\)):
Under the null hypothesis (\(H_0\)):
When simulating under \(H_0\), we consider \(OR_1=OR_2=1\); whereas when simulating under \(H_1\), we consider the following situations:
We consider two different approaches for estimating the correlation between the endpoints 1 and 2.
Blinded approach:
Based on the observed responses in the pooled sample, we estimate the probabilities \(p_1\), \(p_2\), and \(p_*\), where \(p_k=\pi p_k^{(0)} + (1-\pi)p_k^{(1)}\) for \(k=1,2,*\) and \(\pi=n^{(0)}/n\). Assuming the expected effects for the endpoints 1 and 2 (say \(OR_1\) and \(OR_2\)) pre-specified in advance, we obtain estimates of the probabilities of each composite component under the control group \(p_1^{(0)},p_2^{(0)}\) and subsequently the estimates of the probabilities under the treatment group \(p_1^{(1)},p_2^{(1)}\).
Using the estimated probabilities for each composite component in each group (\(p_1^{(0)},p_2^{(0)},p_1^{(1)},p_2^{(1)}\)) and using the estimated pooled probability of the composite endpoint (\(p_*\)), we get an estimate of the correlation.
We calculate the ARE by using the estimated probabilities \(p_1^{(0)},p_2^{(0)}\) and the estimated correlation.
Unblinded approach:
Based on the observed responses in patients in the control group, we estimate the probabilities \(p_1^{(0)},p_2^{(0)},p_*^{(0)}\). Then, using these estimated probabilities, we obtain an estimate of the correlation between the Endpoints 1 and 2 in the control group \(\rho^{(0)}\).
We do the same for the treatment group, and then we obtain an estimate of the correlation in the treatment group \(\rho^{(1)}\).
We calculate the ARE by using the estimated probabilities \(p_1^{(0)},p_2^{(0)}\) and the mean of the two estimates of the correlation.
As mentioned before, Pearson correlation coefficient between two binary endpoints takes values between two bounds: denote by \(B_L(\cdot)\) the lower bound, and by \(B_U(\cdot)\) the upper bound. Both \(B_L(\cdot)\) and \(B_U(\cdot)\) are functions of the marginal probabilities of the binary endpoints.
Thus, in our context, the correlation between endpoints 1 and 2 in the control group lies within \(B_L(p_1^{(0)},p_2^{(0)})\) and \(B_U(p_1^{(0)},p_2^{(0)})\); whereas the correlation between them in the treatment group takes values between \(B_L(p_1^{(1)},p_2^{(1)})\) and \(B_U(p_1^{(1)},p_2^{(1)})\).
Since we assume that the correlation is the same in both treatment groups, we require that:
\[ max(B_L(p_1^{(0)},p_2^{(0)}),B_L(p_1^{(1)},p_2^{(1)})) \leq \rho \leq min(B_U(p_1^{(0)},p_2^{(0)}),B_U(p_1^{(1)},p_2^{(1)})) \]
In the estimation approaches explained above, we compute the correlation bounds based on the estimated probabilities and force the estimated correlation to take values within this interval. We notice that this interval can be different from the one calculated based on the true probabilities (those used to run the simulations).
The sample size used for the simulation is calculated to have \(0.80\) power to detect an effect of \(OR_1\) on the endpoint 1 at significance level \(\alpha = 0.05\).
We summarize the sample sizes used for the scenarios:
summary(dataset_H0False[,21])
## samplesize_e1
## Min. : 714
## 1st Qu.:1335
## Median :3340
## Mean :2886
## 3rd Qu.:3340
## Max. :6072
The following figure illustrates the empirical power we obtained using the composite endpoint (CE), the endpoint 1 (RE) or the selected endpoint (SE).
# subd=dataset#subset(dataset,is.na(dataset$Test_Power_ES_ub)==F)
power_data <- data.frame(Power=c(dataset_H0False$Test_Power_ES,dataset_H0False$Test_Power_ES_ub,
dataset_H0False$Test_Power_CE,dataset_H0False$Test_Power_RE),
Endpoint=c(rep("SE",length(dataset_H0False$Test_Power_ES)),
rep("SE_ub",length(dataset_H0False$Test_Power_ES_ub)),
rep("CE",length(dataset_H0False$Test_Power_CE)),
rep("RE",length(dataset_H0False$Test_Power_RE)))
)
ggplot(power_data, aes(x=Endpoint, y=Power)) + geom_boxplot()
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we plot the empirical power for each design -Selected Endpoint with blinded estimation (SE), Selected Endpoint with unblinded estimation (SE_u), Composite Endpoint (CE), and Relevant Endpoint (RE)- according to the correlation \(\rho\). In the empirical powers for the Selected Endpoint, we plot the points in different colors depending on which endpoint would have been chosen using the ARE calculated from the true parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)).
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we also present a table with:
p <- list()
enum = 1
it <- 1
for(i in 1:max(dataset_H0False$scenario)){
sub=subset(dataset_H0False,dataset_H0False$scenario==i & dataset_H0False$p_init==1.0)
#
p[[enum]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+1]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ub, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+2]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_SS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+3]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ubSS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+4]] <-ggplot(sub, aes(x=corr, y=Test_Power_CE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (CE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
p[[enum+5]] <-ggplot(sub, aes(x=corr, y=Test_Power_RE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (RE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
stable <- data.frame(Corr=sub$corr,
ARE=round(sub$ARE,2),"%CE"=round(100*sub$decision_ES,2),
"%CE_u"=round(100*sub$decision_ES_ub,2),check.names=FALSE)
p[[enum+6]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
stable <- data.frame(Corr=sub$corr,
SS=round(sub$ss_ratio,2),"%CE"=round(100*sub$decision_ES_SS,2),
"%CE_u"=round(100*sub$decision_ES_ubSS,2),check.names=FALSE)
p[[enum+7]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
enum=enum+8
it <- it + dim(subset(dataset_H0False,dataset_H0False$scenario==i))
}
marrangeGrob(p,ncol=4,nrow=1,top=NULL)
# ,heights = c(5, 10)
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we plot the empirical power for each design -Selected Endpoint with blinded estimation (SE), Selected Endpoint with unblinded estimation (SE_u), Composite Endpoint (CE), and Relevant Endpoint (RE)- according to the correlation \(\rho\). In the empirical powers for the Selected Endpoint, we plot the points in different colors depending on which endpoint would have been chosen using the ARE calculated from the true parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)).
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we also present a table with:
p <- list()
enum = 1
it <- 1
for(i in 1:max(dataset_H0False$scenario)){
sub=subset(dataset_H0False,dataset_H0False$scenario==i & dataset_H0False$p_init==0.5)
#
p[[enum]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+1]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ub, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+2]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_SS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+3]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ubSS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+4]] <-ggplot(sub, aes(x=corr, y=Test_Power_CE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (CE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
p[[enum+5]] <-ggplot(sub, aes(x=corr, y=Test_Power_RE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False$p0_e1[it],",",dataset_H0False$p0_e2[it],",",dataset_H0False$OR1[it],",",dataset_H0False$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (RE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
stable <- data.frame(Corr=sub$corr,
ARE=round(sub$ARE,2),"%CE"=round(100*sub$decision_ES,2),
"%CE_u"=round(100*sub$decision_ES_ub,2),check.names=FALSE)
p[[enum+6]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
stable <- data.frame(Corr=sub$corr,
SS=round(sub$ss_ratio,2),"%CE"=round(100*sub$decision_ES_SS,2),
"%CE_u"=round(100*sub$decision_ES_ubSS,2),check.names=FALSE)
p[[enum+7]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
enum=enum+8
it <- it + dim(subset(dataset_H0False,dataset_H0False$scenario==i))
}
marrangeGrob(p,ncol=4,nrow=1,top=NULL)
# ,heights = c(5, 10)
The sample size used for the simulation is calculated to have \(0.80\) power to detect an effect of \(OR_*\) on the composite endpoint at significance level \(\alpha = 0.05\). The odds ratio \(OR_*\) is computed based on the parameters of the composite components and assuming correlation equal 0.
We summarize the sample sizes used for the scenarios:
# summary(dataset_H0False_ss[,c(21,22)])
summary(dataset_H0False_ss[,c(18,22)])
## OR_ce samplesize_ce
## Min. :0.6321 Min. : 619.6
## 1st Qu.:0.6850 1st Qu.:1063.2
## Median :0.7522 Median :1663.4
## Mean :0.7325 Mean :1868.3
## 3rd Qu.:0.7842 3rd Qu.:2429.6
## Max. :0.7995 Max. :5201.4
The following figure illustrates the empirical power we obtained using the composite endpoint (CE), the endpoint 1 (RE) or the selected endpoint (SE).
# subd=dataset#subset(dataset,is.na(dataset$Test_Power_ES_ub)==F)
power_data <- data.frame(Power=c(dataset_H0False_ss$Test_Power_ES,dataset_H0False_ss$Test_Power_ES_ub,
dataset_H0False_ss$Test_Power_CE,dataset_H0False_ss$Test_Power_RE),
Endpoint=c(rep("SE",length(dataset_H0False_ss$Test_Power_ES)),
rep("SE_ub",length(dataset_H0False_ss$Test_Power_ES_ub)),
rep("CE",length(dataset_H0False_ss$Test_Power_CE)),
rep("RE",length(dataset_H0False_ss$Test_Power_RE)))
)
ggplot(power_data, aes(x=Endpoint, y=Power)) + geom_boxplot()
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we plot the empirical power for each design -Selected Endpoint with blinded estimation (SE), Selected Endpoint with unblinded estimation (SE_u), Composite Endpoint (CE), and Relevant Endpoint (RE)- according to the correlation \(\rho\). In the empirical powers for the Selected Endpoint, we plot the points in different colors depending on which endpoint would have been chosen using the ARE calculated from the true parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)).
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we also present a table with:
p <- list()
enum = 1
it <- 1
for(i in 1:max(dataset_H0False_ss$scenario)){
sub=subset(dataset_H0False_ss,dataset_H0False_ss$scenario==i & dataset_H0False_ss$p_init==1.0)
#
p[[enum]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+1]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ub, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+2]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_SS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+3]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ubSS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+4]] <-ggplot(sub, aes(x=corr, y=Test_Power_CE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (CE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
p[[enum+5]] <-ggplot(sub, aes(x=corr, y=Test_Power_RE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (RE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
stable <- data.frame(Corr=sub$corr,
ARE=round(sub$ARE,2),"%CE"=round(100*sub$decision_ES,2),
"%CE_u"=round(100*sub$decision_ES_ub,2),check.names=FALSE)
p[[enum+6]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
stable <- data.frame(Corr=sub$corr,
SS=round(sub$ss_ratio,2),"%CE"=round(100*sub$decision_ES_SS,2),
"%CE_u"=round(100*sub$decision_ES_ubSS,2),check.names=FALSE)
p[[enum+7]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
enum=enum+8
it <- it + dim(subset(dataset_H0False_ss,dataset_H0False_ss$scenario==i))
}
marrangeGrob(p,ncol=4,nrow=1,top=NULL)
# ,heights = c(5, 10)
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we plot the empirical power for each design -Selected Endpoint with blinded estimation (SE), Selected Endpoint with unblinded estimation (SE_u), Composite Endpoint (CE), and Relevant Endpoint (RE)- according to the correlation \(\rho\). In the empirical powers for the Selected Endpoint, we plot the points in different colors depending on which endpoint would have been chosen using the ARE calculated from the true parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)).
For each combination of (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2\)), we also present a table with:
p <- list()
enum = 1
it <- 1
for(i in 1:max(dataset_H0False_ss$scenario)){
sub=subset(dataset_H0False_ss,dataset_H0False_ss$scenario==i & dataset_H0False_ss$p_init==0.5)
#
p[[enum]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+1]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ub, color=as.factor(decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+2]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_SS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+3]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ubSS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
#
p[[enum+4]] <-ggplot(sub, aes(x=corr, y=Test_Power_CE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (CE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
p[[enum+5]] <-ggplot(sub, aes(x=corr, y=Test_Power_RE)) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_H0False_ss$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_H0False_ss$p0_e1[it],",",dataset_H0False_ss$p0_e2[it],",",dataset_H0False_ss$OR1[it],",",dataset_H0False_ss$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (RE)", x="Correlation", color="Decision") + coord_cartesian(ylim = c(0.60, 1))+ geom_path()+ theme(plot.title = element_text(size=9))
stable <- data.frame(Corr=sub$corr,
ARE=round(sub$ARE,2),"%CE"=round(100*sub$decision_ES,2),
"%CE_u"=round(100*sub$decision_ES_ub,2),check.names=FALSE)
p[[enum+6]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
stable <- data.frame(Corr=sub$corr,
SS=round(sub$ss_ratio,2),"%CE"=round(100*sub$decision_ES_SS,2),
"%CE_u"=round(100*sub$decision_ES_ubSS,2),check.names=FALSE)
p[[enum+7]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
enum=enum+8
it <- it + dim(subset(dataset_H0False_ss,dataset_H0False_ss$scenario==i))
}
marrangeGrob(p,ncol=4,nrow=1,top=NULL)
# ,heights = c(5, 10)
p <- list()
enum = 1
it <- 1
for(i in 1:max(dataset_SSadd$scenario)){
sub=subset(dataset_SSadd,dataset_SSadd$scenario==i & dataset_SSadd$p_init==1.0)
p[[enum]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_SS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_SSadd$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_SSadd$p0_e1[it],",",dataset_SSadd$p0_e2[it],",",dataset_SSadd$OR1[it],",",dataset_SSadd$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+1]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ubSS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_SSadd$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_SSadd$p0_e1[it],",",dataset_SSadd$p0_e2[it],",",dataset_SSadd$OR1[it],",",dataset_SSadd$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
stable <- data.frame(Corr=sub$corr,
SS=round(sub$ss_ratio,2),"%CE"=round(100*sub$decision_ES_SS,2),
"%CE_u"=round(100*sub$decision_ES_ubSS,2),check.names=FALSE)
p[[enum+2]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
enum=enum+3
it <- it + dim(subset(dataset_SSadd,dataset_SSadd$scenario==i))
}
marrangeGrob(p,ncol=3,nrow=1,top=NULL)
p <- list()
enum = 1
it <- 1
for(i in 1:max(dataset_SSadd$scenario)){
sub=subset(dataset_SSadd,dataset_SSadd$scenario==i & dataset_SSadd$p_init==0.5)
p[[enum]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_SS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_SSadd$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_SSadd$p0_e1[it],",",dataset_SSadd$p0_e2[it],",",dataset_SSadd$OR1[it],",",dataset_SSadd$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
p[[enum+1]] <-ggplot(sub, aes(x=corr, y=Test_Power_ES_ubSS, color=as.factor(ss_decision))) +
geom_point(size=2)+ ggtitle(paste("Scenario", dataset_SSadd$scenario[it], "\n (p1,p2,OR1,OR2) \n=(", dataset_SSadd$p0_e1[it],",",dataset_SSadd$p0_e2[it],",",dataset_SSadd$OR1[it],",",dataset_SSadd$OR2[it],")"))+geom_point(size=2) + labs(y = "Empirical Power (ES, unblinded)", x="Correlation", color="Decision SS") + coord_cartesian(ylim = c(0, 1))+ geom_path()+ theme(plot.title = element_text(size=9),legend.position = c(0.8, 0.2))
stable <- data.frame(Corr=sub$corr,
SS=round(sub$ss_ratio,2),"%CE"=round(100*sub$decision_ES_SS,2),
"%CE_u"=round(100*sub$decision_ES_ubSS,2),check.names=FALSE)
p[[enum+2]] <- ggtexttable(stable, rows = NULL, theme = ttheme(base_style = "default", base_size = 9))
enum=enum+3
it <- it + dim(subset(dataset_SSadd,dataset_SSadd$scenario==i))
}
marrangeGrob(p,ncol=3,nrow=1,top=NULL)
The following boxplots show the significance level obtained using the composite endpoint (CE), the endpoint 1 (RE) or the selected endpoint (SE).
signlevel_data <- data.frame(signlevel=c(dataset_H0True$Test_Reject_ES,
dataset_H0True$Test_Reject_ES_ub,
dataset_H0True$Test_Reject_ES_SS,
dataset_H0True$Test_Reject_ES_ubSS,
dataset_H0True$Test_Reject_CE,dataset_H0True$Test_Reject_RE),
Endpoint=c(rep("SE",length(dataset_H0True$Test_Reject_ES)),
rep("SE(ub)",length(dataset_H0True$Test_Reject_ES_ub)),
rep("SE (ss/b)",length(dataset_H0True$Test_Reject_ES_SS)),
rep("SE (ss/ub)",length(dataset_H0True$Test_Reject_ES_ubSS)),
rep("CE",length(dataset_H0True$Test_Reject_CE)),
rep("RE",length(dataset_H0True$Test_Reject_RE)))
)
ggplot(signlevel_data, aes(x=Endpoint, y=signlevel)) + geom_boxplot()
General summary:
subset_h0 <- data.frame(Test_Reject_ES=dataset_H0True$Test_Reject_ES,
Test_Reject_ES_ub=dataset_H0True$Test_Reject_ES_ub,
Test_Reject_ES_SS=dataset_H0True$Test_Reject_ES_SS,
Test_Reject_ES_ubSS=dataset_H0True$Test_Reject_ES_ubSS,
Test_Reject_CE=dataset_H0True$Test_Reject_CE,
Test_Reject_RE=dataset_H0True$Test_Reject_RE
)
summary(subset_h0)
## Test_Reject_ES Test_Reject_ES_ub Test_Reject_ES_SS Test_Reject_ES_ubSS
## Min. :0.04810 Min. :0.04817 Min. :0.04813 Min. :0.04798
## 1st Qu.:0.04936 1st Qu.:0.04991 1st Qu.:0.04938 1st Qu.:0.04977
## Median :0.04985 Median :0.05072 Median :0.04983 Median :0.05072
## Mean :0.04990 Mean :0.05133 Mean :0.04986 Mean :0.05137
## 3rd Qu.:0.05044 3rd Qu.:0.05214 3rd Qu.:0.05029 3rd Qu.:0.05223
## Max. :0.05167 Max. :0.06368 Max. :0.05195 Max. :0.06459
## Test_Reject_CE Test_Reject_RE
## Min. :0.04749 Min. :0.04689
## 1st Qu.:0.04918 1st Qu.:0.04873
## Median :0.04972 Median :0.04934
## Mean :0.04968 Mean :0.04932
## 3rd Qu.:0.05019 3rd Qu.:0.04981
## Max. :0.05153 Max. :0.05158
Summary cases type 1 error rate\(>0.055\) with SE(ub):
summary(subset(dataset_H0True,dataset_H0True$Test_Reject_ES_ubSS>0.055)[,c(1:5,15,16,18,19,23)])
## p0_e1 p0_e2 OR1 OR2
## Min. :0.1000 Min. :0.1000 Min. :0.6 Min. :0.7500
## 1st Qu.:0.1000 1st Qu.:0.1000 1st Qu.:0.6 1st Qu.:0.7500
## Median :0.1000 Median :0.2500 Median :0.6 Median :0.7500
## Mean :0.1235 Mean :0.1794 Mean :0.6 Mean :0.7706
## 3rd Qu.:0.1000 3rd Qu.:0.2500 3rd Qu.:0.6 3rd Qu.:0.8000
## Max. :0.2000 Max. :0.2500 Max. :0.6 Max. :0.8000
## p_init corr p0_ce OR_ce
## Min. :0.5000 Min. :0.0000 Min. :0.1540 Min. :0.6338
## 1st Qu.:0.5000 1st Qu.:0.1000 1st Qu.:0.1900 1st Qu.:0.6690
## Median :0.5000 Median :0.2000 Median :0.2600 Median :0.6884
## Mean :0.6471 Mean :0.2118 Mean :0.2551 Mean :0.6923
## 3rd Qu.:1.0000 3rd Qu.:0.3000 3rd Qu.:0.3120 3rd Qu.:0.7184
## Max. :1.0000 Max. :0.5000 Max. :0.3654 Max. :0.7408
## ARE ss_ratio
## Min. :0.8032 Min. :0.948
## 1st Qu.:0.8847 1st Qu.:1.013
## Median :0.9118 Median :1.019
## Mean :0.9070 Mean :1.019
## 3rd Qu.:0.9387 3rd Qu.:1.031
## Max. :0.9769 Max. :1.118
We ran an extra simulation to assess the correlation estimator in both blinded and unblinded approaches. Using the same scenarios we considered before, we simulate each scenario \(nsim = 1000\) times. For each run, we estimate the correlation using the blinded and unblinded approach. We then compute the mean and variance of these estimators. We compute the bias as the difference between the mean value and the true value of the correlation in that scenario.
Scatterplots estimated correlation vs true correlation:
ggplot(dataset, aes(x=corr, y=corr_est_ub_mean)) + geom_point(size=2) + labs(y = "Mean value correlation estimates (Unblinded approach)", x="True correlation")
ggplot(dataset, aes(x=corr, y=corr_est_b_mean)) + geom_point(size=2) + labs(y = "Mean value correlation estimates (Blinded approach)", x="True correlation")
Summary bias:
# summary(dataset$corr)
# summary(dataset[,31:36])
summary(dataset[,35:36])
## Test_Reject_CE Test_Reject_RE
## Mode:logical Mode:logical
## NA's:242 NA's:242
In what follows we compare the statistical powers obtained using the composite endpoint with the ones obtained using the endpoint 1 as primary endpoint. We also evaluate the concordance between the choice of the primary endpoint -according to the ARE value and the ratio of the sample sizes- and the difference in powers obtained in the simulation. In this case, we calculate the ARE at the beginning of the study, and then according to the parameters (\(p_1^{(0)},p_2^{(0)},OR_1,OR_2,\rho\)).
dataset_H0False$diff_powers = dataset_H0False$Test_Power_CE - dataset_H0False$Test_Power_RE
The following plot shows the relationship between the decision criteria and the difference in power between using the composite endpoint (CE) or using the endpoint 1 (RE).
Decision based on the ARE:
ggplot(dataset_H0False, aes(x=ARE, y=diff_powers, color=as.factor(decision))) + geom_point(size=2) + labs(y = "Difference in power (Power CE - Power RE)", x="ARE", color="Decision")
Decision based on the ratio of sample sizes:
ggplot(dataset_H0False, aes(x=ss_ratio, y=diff_powers, color=as.factor(ss_decision))) + geom_point(size=2) + labs(y = "Difference in power (Power CE - Power RE)", x="Ratio SS", color="Decision")
Cases in which there is no concordance between the decision and the power gain:
dataset_H0False$gain_power = ifelse(dataset_H0False$diff_powers>0, "CE", "RE")
dataset_H0False$gain_power = as.factor(dataset_H0False$gain_power)
dataset_H0False$decision = as.factor(dataset_H0False$decision)
summary(subset(dataset_H0False,dataset_H0False$gain_power != dataset_H0False$decision)[,c(3,4,18,19,20,21,22)])
## OR1 OR2 OR_ce ARE decision
## Min. :0.6 Min. :0.750 Min. :0.6338 Min. :0.8664 CE: 0
## 1st Qu.:0.6 1st Qu.:0.750 1st Qu.:0.6669 1st Qu.:0.9118 RE:20
## Median :0.6 Median :0.750 Median :0.6787 Median :0.9351
## Mean :0.6 Mean :0.765 Mean :0.6883 Mean :0.9359
## 3rd Qu.:0.6 3rd Qu.:0.800 3rd Qu.:0.7184 3rd Qu.:0.9657
## Max. :0.6 Max. :0.800 Max. :0.7350 Max. :0.9935
## samplesize_e1 samplesize_ce
## Min. : 714 Min. : 658.5
## 1st Qu.: 714 1st Qu.: 700.5
## Median :1335 Median :1223.0
## Mean :1149 Mean :1091.4
## 3rd Qu.:1335 3rd Qu.:1303.5
## Max. :1335 Max. :1317.5
Conclusions:
Future steps:
Lefkopoulou, M., & Ryan, L. (1993). Global Tests for Multiple Binary Outcomes. Biometrics.↩︎
Legler, J. M., Lefkopoulou, M., & Ryan, L. (1995). Efficiency and Power of Tests for Multiple Binary Outcomes. JASA.↩︎
Cuzick, J. (1982). The Efficiency of the Proportions Test and the Logrank Test for Censored Survival Data. Biometrics.↩︎